5  Technology space

5.0.1 Feature Importance Technology Space

A natural approach to constructing technology networks would be to use patent citation data: if Patent A cites Patent B, and they contain different technologies, this reveals a relationship between those technology domains. However, this approach presents three critical limitations for our purposes. First, EPO citations include examiner-added citations that may not reflect actual knowledge flows between inventors. Second, aggregating patent-level citations to technology-level relationships relies on co-occurrence patterns that obscure the directionality of knowledge dependencies—whether technology A enables B or vice versa. Third, citation networks are inherently backward-looking, reflecting past relationships rather than predicting future technological trajectories.

To address these limitations, we adopt the Feature Importance Product Space (FIPS) methodology (Fessina et al. 2024). Rather than inferring relationships from citation co-occurrence, FIPS uses machine learning to identify which existing technology specializations predict future specialization in other technologies. Specifically, we train random forest models where the current presence of regional expertise in technology T is predicted by past expertise patterns across all other technologies. The resulting feature importance scores reveal directional, predictive relationships: if expertise in technology A strongly predicts future expertise in B, this indicates A is a “stepping stone” toward B, even if the reverse is not true. This asymmetric structure captures hierarchical technological dependencies that symmetric co-occurrence measures would miss, and its forward-looking nature aligns with our focus on how current network positions influence future patent value.
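The procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function name `feature_importance_space`, the binary specialization matrices `past` and `current`, the use of scikit-learn's `RandomForestClassifier`, and the hyperparameters are all assumptions made for illustration. One forest is trained per target technology, with past specialization in every other technology as predictors, and the impurity-based feature importances are stored as a directed weight matrix.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def feature_importance_space(past, current, n_estimators=50, seed=0):
    """Directed technology-to-technology weights from random forests.

    past, current: binary (regions x technologies) arrays marking whether a
    region is specialized in a technology in the earlier and later period.
    Returns `importance`, where importance[a, b] is the weight of past
    specialization in a when predicting current specialization in b.
    """
    n_tech = past.shape[1]
    importance = np.zeros((n_tech, n_tech))
    for target in range(n_tech):
        y = current[:, target]
        if y.min() == y.max():
            continue  # target is constant across regions: nothing to learn
        # Predictors are all technologies except the target itself.
        predictors = [t for t in range(n_tech) if t != target]
        model = RandomForestClassifier(n_estimators=n_estimators,
                                       random_state=seed)
        model.fit(past[:, predictors], y)
        importance[predictors, target] = model.feature_importances_
    return importance


# Toy data (hypothetical): technology 0 is a stepping stone toward
# technology 1, while every other column is random noise.
rng = np.random.default_rng(0)
past = rng.integers(0, 2, size=(200, 4))
current = rng.integers(0, 2, size=(200, 4))
current[:, 1] = past[:, 0]

importance = feature_importance_space(past, current)
```

In this toy setting `importance[0, 1]` should dominate its column while `importance[1, 0]` stays close to the noise level, which is the asymmetry that a symmetric co-occurrence measure cannot express.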

Following Fessina et al. (2024), we quantify specialization patterns with the Revealed Comparative Advantage (RCA) index (Balassa 1965). Although the RCA was designed for international trade data, it has also been adopted in the geography of innovation literature. Following the Balland nomenclature (balland2017geography?), we refer to this metric as the Revealed Technological Advantage (RTA). The RTA measures a location’s relative specialization in a given technology, which lets us capture both expertise and diversity when we aggregate across all technologies for each location. This measure has proved useful for uncovering complex, non-linear relationships between products or activities. Simply put, the RTA simultaneously quantifies the relative level and the quality of co-occurrence, which reduces noise in the network data. Although some papers criticise the use of the RCA/RTA with patent classes [pinheiro2025], we consider it suited to our objective of capturing meaningful relationships between technologies within the framework of Fessina et al. (2024). We compute the RTA for each year, obtaining a matrix with regions in its rows and technologies in its columns. We formalize it as follows. Let \(X_{r,t,y}\) be the measure of activity (patent counts) of region \(r\) in technology \(t\) during year \(y\), where \(r \in \mathcal{R}\), the set of regions, \(t \in \mathcal{T}\), the set of technologies, and \(y \in \mathcal{Y}\), the set of years.
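As a minimal sketch of how the region-by-technology matrix for a single year might be computed, the snippet below assumes the standard Balassa ratio: the share of technology \(t\) in region \(r\)'s patents divided by the share of \(t\) in all patents. The function name, the dictionary layout of the counts, and the binarization threshold of 1 are illustrative assumptions, not details taken from the text.

```python
from collections import defaultdict


def revealed_technological_advantage(counts):
    """Balassa-style RTA for one year.

    counts: dict mapping (region, technology) -> patent count X_{r,t}.
    Returns a dict with the same keys holding the RTA values.
    """
    region_totals = defaultdict(float)
    tech_totals = defaultdict(float)
    grand_total = 0.0
    for (region, tech), x in counts.items():
        region_totals[region] += x
        tech_totals[tech] += x
        grand_total += x

    rta = {}
    for (region, tech), x in counts.items():
        if region_totals[region] == 0 or tech_totals[tech] == 0:
            rta[(region, tech)] = 0.0  # no activity: define RTA as zero
            continue
        region_share = x / region_totals[region]        # share of t within r
        global_share = tech_totals[tech] / grand_total  # share of t overall
        rta[(region, tech)] = region_share / global_share
    return rta


# Toy counts (hypothetical): two regions, two technologies, one year.
counts = {
    ("R1", "T1"): 8, ("R1", "T2"): 2,
    ("R2", "T1"): 2, ("R2", "T2"): 8,
}
rta = revealed_technological_advantage(counts)

# Assumed convention: a region counts as specialized when RTA >= 1.
specialized = {key: value >= 1.0 for key, value in rta.items()}
```

Here region R1 holds 80% of its patents in T1 while T1 accounts for 50% of all patents, so its RTA in T1 is 1.6 and it would be flagged as specialized under the assumed threshold.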